10 research outputs found

    A Real-Time Gaze Estimation Framework for Mobile Devices

    Eye tracking is becoming an important component in enabling new ways of human-machine interaction in augmented and virtual reality (AR/VR). To be responsive, eye tracking systems need to operate at a real-time rate (> 30 Hz). However, in our experiments, modern gaze tracking algorithms operate at no more than 5 Hz on mobile processors. In this talk, we present a real-time eye tracking algorithm that operates at 30 Hz on a mobile processor. Our algorithm achieves sub-0.5° gaze accuracy while requiring only 30K parameters, which is one to two orders of magnitude fewer than state-of-the-art algorithms.

    Efficient Complex Operators for Irregular Codes

    Complex “fat operators” are important contributors to the efficiency of specialized hardware. This paper introduces two new techniques for constructing efficient fat operators featuring up to dozens of operations with arbitrary and irregular data and memory dependencies. These techniques focus on minimizing critical path length and load-use delay, which are key concerns for irregular computations. Selective Depipelining (SDP) is a pipelining technique that allows fat operators to contain several, possibly dependent, memory operations. SDP allows memory requests to operate at a faster clock rate than the datapath, saving power in the datapath and improving memory performance. Cachelets are small, customized, distributed L0 caches embedded in the datapath to reduce load-use latency. We apply these techniques to Conservation Cores (c-cores) to produce coprocessors that accelerate irregular code regions while still providing superior energy efficiency. On average, these enhanced c-cores reduce the energy-delay product (EDP) by 2× and area by 35% relative to c-cores. They are up to 2.5× faster than a general-purpose processor and reduce energy consumption by up to 8× for a variety of irregular applications, including several SPECINT benchmarks.

    Qscores: Trading dark silicon for scalable energy efficiency with quasi-specific cores

    Transistor density continues to increase exponentially, but power dissipation per transistor is improving only slightly with each generation of Moore’s law. Given the constant chip-level power budgets, this exponentially decreases the percentage of transistors that can switch at full frequency with each technology generation. Hence, while the transistor budget continues to increase exponentially, the power budget has become the dominant limiting factor in processor design. In this regime, utilizing transistors to design specialized cores that optimize energy-per-computation becomes an effective approach to improve system performance. To trade transistors for energy efficiency in a scalable manner, we propose Quasi-specific Cores, or QSCORES, specialized processors capable of executing multiple general-purpose computations while providing an order of magnitude more energy efficiency than a general-purpose processor. The QSCORE design flow is based on the insight that similar code patterns exist within and across applications. Our approach exploits these similar code patterns to ensure that a small set of specialized cores support a large number of commonly used computations. We evaluate QSCORE’s ability to target both a single application library (e.g., data structures) as well as a diverse workload consisting of applications selected from different domains (e.g., SPECINT, EEMBC, and Vision). Our results show that QSCORES can provide 18.4× better energy efficiency than general-purpose processors while reducing the amount of specialized logic required to support the workload by up to 66%.